Big Data Classification Using Distributed Optimized Hoeffding Trees
نویسندگان
چکیده
منابع مشابه
Big Data Classification Using Augmented Decision Trees
We present an algorithm for classification tasks on big data. Experiments conducted as part of this study indicate that the algorithm can be as accurate as ensemble methods such as random forests or gradient boosted trees. Unlike ensemble methods, the models produced by the algorithm can be easily interpreted. The algorithm is based on a divide and conquer strategy and consists of two steps. Th...
متن کاملTackling Simpson's Paradox in Big Data using Classification & Regression Trees
This work is aimed at finding potential Simpson’s paradoxes in Big Data. Simpson’s paradox (SP) arises when choosing the level of data aggregation for causal inference. It describes the phenomenon where the direction of a cause on an effect is reversed when examining the aggregate vs. disaggregates of a sample or population. The practical decision making dilemma that SP raises is which level of...
متن کاملAdaptive Anomaly Intrusion Detection System Using Optimized Hoeffding Tree
Anomaly Intrusion Detection System is used to identify a new attack in the network by identifying the deviations in the network traffic patterns. Though it identifies new attacks efficiently, the false alarm rate is usually high in this system. As there may be attack in the network at any time and as the input traffic varies over time, we need a model which efficiently identifies the change in ...
متن کاملA Ensembles of Restricted Hoeffding Trees
The success of simple methods for classification shows that is is often not necessary to model complex attribute interactions to obtain good classification accuracy on practical problems. In this paper, we propose to exploit this phenomenon in the data stream context by building an ensemble of Hoeffding trees that are each limited to a small subset of attributes. In this way, each tree is restr...
متن کاملAccurate Ensembles for Data Streams: Combining Restricted Hoeffding Trees using Stacking
The success of simple methods for classification shows that is is often not necessary to model complex attribute interactions to obtain good classification accuracy on practical problems. In this paper, we propose to exploit this phenomenon in the data stream context by building an ensemble of Hoeffding trees that are each limited to a small subset of attributes. In this way, each tree is restr...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Journal of Machine Intelligence
سال: 2017
ISSN: 2377-2220
DOI: 10.21174/jomi.v2i1.101